Study companion plugin: support image paste in text boxes and vision input #1603
No actionable comments were generated in the recent review. 🎉
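Walkthrough
This PR adds vision image input support to the study companion plugin: it introduces shared validation wired into three entry points, implements paste compression and UI bindings on the frontend, and extends the styles and unit test coverage, meow.
Changes
End-to-end support for vision image input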
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ 4 passed
💡 Codex Review
When the UI sends a pasted image with an empty text box, this new condition lets the request proceed, but study_explain_text then calls concept_explain(source_text, ...) with source_text == ""; tutor_llm_agent_concept_explain.concept_explain immediately returns the empty_input degraded reply before it attaches vision_image_base64. In the image-only paste path, the model never sees the image and the user gets an empty-input response, so pass a small prompt as source_text or teach the agent to invoke vision when only the image is present.
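The second option in the comment above (teaching the agent to invoke vision when only the image is present) could look roughly like the sketch below. `concept_explain_sketch` and its return shape are hypothetical stand-ins invented for illustration, not the real `concept_explain`; only the `empty_input` reason, the `vision_image_base64` field, and the English prompt string come from the review thread.

```python
from typing import Optional


def concept_explain_sketch(source_text: str,
                           vision_image_base64: Optional[str] = None) -> dict:
    """Hypothetical stand-in for the agent entry point (not the real API)."""
    text = source_text.strip()
    image = (vision_image_base64 or "").strip()

    # Degrade to empty_input only when there is neither text nor an image.
    if not text and not image:
        return {"status": "degraded", "reason": "empty_input"}

    # With an image but no text, fall back to a short prompt so the model
    # still receives an instruction alongside the picture.
    request = {"status": "ok",
               "prompt": text or "Please explain the pasted image."}
    if image:
        # Attach the image so the vision model actually sees it.
        request["vision_image_base64"] = image
    return request
```

The point of the sketch is the ordering: the empty-input check considers the image before short-circuiting, so an image-only paste reaches the vision path.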
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
  used_ocr_fallback = bool(source_text.strip())
  source_text = source_text.strip()
- if not source_text:
+ if not source_text and not vision_image_payload:
Invoke vision for image-only question generation
This now accepts vision_image_base64 without any text/OCR, but in that scenario source_text remains empty and the later call to self._agent.question_generate(source_text, ...) short-circuits in tutor_llm_agent_question_generate.question_generate to the empty_input fallback before _invoke_structured_operation can attach the image. Users pasting only a diagram/photo and clicking Generate Question therefore get the generic empty-input fallback instead of a vision-generated question.
🧹 Nitpick comments (1)
plugin/plugins/study_companion/surfaces/study_panel.tsx (1)
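110-110: ⚡ Quick win: internationalize the timeout error message, meow~
At Line 110, the timeout error message "图片加载超时" is a hard-coded Chinese string. Although this error only shows up in the developer console (it is never displayed directly to users), it should go through the translation system to keep i18n consistent across the codebase. Error messages elsewhere are translated via t() (for example Lines 722-723), so this one should match. A developer without a Chinese-language background could be confused by this message while debugging, meow.
🌸 Suggested change
Since this is a developer-console message, consider switching the timeout message to English:
  const timeoutPromise = new Promise<never>((_, reject) => {
-   timeoutId = window.setTimeout(() => reject(new Error('图片加载超时')), timeoutMs);
+   timeoutId = window.setTimeout(() => reject(new Error('Image load timeout')), timeoutMs);
  });
Alternatively, if the team wants every message to support i18n, consider passing in an already translated message as a parameter, meow~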
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/surfaces/study_panel.tsx` at line 110, Replace the hard-coded Chinese timeout message used in the setTimeout rejection with a translated message via the existing i18n helper (use t(...)); specifically update the timeout callback that calls reject(new Error('图片加载超时')) so it calls reject(new Error(t('image.load_timeout'))) or another appropriate i18n key/value, keeping the same symbols (timeoutId, reject, timeoutMs) and ensuring the t() function is imported/available in study_panel.tsx before use.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@plugin/plugins/study_companion/surfaces/study_panel.tsx`:
- Line 110: Replace the hard-coded Chinese timeout message used in the
setTimeout rejection with a translated message via the existing i18n helper (use
t(...)); specifically update the timeout callback that calls reject(new
Error('图片加载超时')) so it calls reject(new Error(t('image.load_timeout'))) or
another appropriate i18n key/value, keeping the same symbols (timeoutId, reject,
timeoutMs) and ensuring the t() function is imported/available in
study_panel.tsx before use.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 52c29d29-628b-4890-b38e-34ec568bb7bf
📒 Files selected for processing (8)
- plugin/plugins/study_companion/entry_common.py
- plugin/plugins/study_companion/entry_tutor_answer_entries.py
- plugin/plugins/study_companion/entry_tutor_explain_entries.py
- plugin/plugins/study_companion/entry_tutor_question_entries.py
- plugin/plugins/study_companion/static/style.css
- plugin/plugins/study_companion/surfaces/study_panel.tsx
- plugin/tests/unit/plugins/test_study_companion.py
- plugin/tests/unit/plugins/test_study_companion_vision.py
🧹 Nitpick comments (1)
plugin/plugins/study_companion/entry_tutor_explain_entries.py (1)
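21-21: 💤 Low value: the image-only prompt is hard-coded English, which is inconsistent with the Chinese fallback used elsewhere in the same plugin, meow.
At L54, study_submit_image uses the Chinese fallback "请查看这张图片的内容", whereas the newly added IMAGE_ONLY_EXPLAIN_PROMPT is the fixed English string "Please explain the pasted image.". The plugin already has self._cfg.language, and for Chinese users this prompt enters the history/context as source_text, so the experience feels a bit disjointed, meow~ Hmph, how about localizing it by language, silly author~ (Nothing breaks if you leave it as is; this is purely a perfectionist's reminder, meow.)
🤖 Prompt for AI Agents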
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py` at line 21, IMAGE_ONLY_EXPLAIN_PROMPT is hardcoded in English while study_submit_image uses a Chinese fallback; make the prompt respect the plugin language setting by replacing the constant with language-aware selection (use self._cfg.language inside the same module/class) or derive it from the same Chinese fallback ("请查看这张图片的内容") when language is 'zh' (and a suitable English string otherwise), and ensure IMAGE_ONLY_EXPLAIN_PROMPT (or its replacement) is used consistently by study_submit_image and any other callers so the source_text matches the user's language.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py`:
- Line 21: IMAGE_ONLY_EXPLAIN_PROMPT is hardcoded in English while
study_submit_image uses a Chinese fallback; make the prompt respect the plugin
language setting by replacing the constant with language-aware selection (use
self._cfg.language inside the same module/class) or derive it from the same
Chinese fallback ("请查看这张图片的内容") when language is 'zh' (and a suitable English
string otherwise), and ensure IMAGE_ONLY_EXPLAIN_PROMPT (or its replacement) is
used consistently by study_submit_image and any other callers so the source_text
matches the user's language.
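A minimal sketch of the language-aware selection this nitpick asks for. The `_EN` constant name, the locale codes (`zh-CN`, `zh-TW`, and so on), and the normalization rules are assumptions made for illustration; the prompt strings themselves are the ones quoted in the review.

```python
# Prompt strings quoted in the review; the _EN constant name is hypothetical.
IMAGE_ONLY_EXPLAIN_PROMPT_EN = "Please explain the pasted image."
IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN = "请查看这张图片的内容"
IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW = "請查看這張圖片的內容"


def image_only_explain_prompt(language: str) -> str:
    """Pick the image-only prompt for a locale code (assumed format: 'zh-CN', 'en', ...)."""
    code = (language or "").strip().lower().replace("_", "-")
    if code.startswith("zh"):
        # Treat Taiwan, Hong Kong, and explicit Hant tags as Traditional Chinese.
        if any(tag in code for tag in ("tw", "hk", "hant")):
            return IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW
        return IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN
    # Anything else, including an empty setting, falls back to English.
    return IMAGE_ONLY_EXPLAIN_PROMPT_EN
```

Keeping the lookup in one helper means every caller that needs an image-only prompt gets the same string for the same language setting.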
📒 Files selected for processing (5)
- plugin/plugins/study_companion/entry_tutor_explain_entries.py
- plugin/plugins/study_companion/entry_tutor_question_entries.py
- plugin/plugins/study_companion/surfaces/study_panel.tsx
- plugin/tests/unit/plugins/test_study_companion.py
- plugin/tests/unit/plugins/test_study_companion_vision.py
✅ Files skipped from review due to trivial changes (1)
- plugin/plugins/study_companion/surfaces/study_panel.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- plugin/tests/unit/plugins/test_study_companion.py
- plugin/plugins/study_companion/entry_tutor_question_entries.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8851217876
if not source_text and vision_image_payload:
    source_text = _image_only_question_prompt(self._cfg.language)
    image_only_source = True
Prefer pasted image over stale OCR for question generation
When the user pastes only an image while last_ocr_text is still populated from an earlier snapshot, the earlier OCR fallback fills source_text before this image-only branch runs, so the branch is skipped and the request is labeled ocr_snapshot rather than vision_image. In that scenario, Generate Question sends stale OCR text alongside the new image instead of using the image-only prompt, which can produce questions about the previous screen; skip the OCR fallback whenever vision_image_base64 is supplied without explicit text.
if not source_text:
    source_text = _image_only_explain_prompt(self._cfg.language)
    image_only_source = True
Prefer pasted image over stale OCR for image-only explanations
For an image-only Explain action with a previous OCR snapshot in state, source_text has already been filled from last_ocr_text before this check, so the image-only prompt is never selected and the context source remains ocr_snapshot. This makes a pasted image get explained with stale OCR text from a prior screen; the OCR fallback should only run when no pasted vision image was provided.
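Both stale-OCR findings come down to the order in which the source text is resolved. The sketch below shows one possible ordering: explicit text first, then the pasted image, and the OCR snapshot only as a last resort. `resolve_source`, `ResolvedSource`, and the `manual_text` label are hypothetical names; `ocr_snapshot` and `vision_image` are the context-source labels mentioned in the review.

```python
from typing import NamedTuple, Optional


class ResolvedSource(NamedTuple):
    source_text: str
    context_source: str  # "manual_text" is an assumed label


def resolve_source(text: str,
                   vision_image_payload: Optional[str],
                   last_ocr_text: str,
                   image_only_prompt: str) -> Optional[ResolvedSource]:
    """Hypothetical helper: decide what text accompanies the request."""
    manual = text.strip()
    if manual:
        # Explicit user text always wins.
        return ResolvedSource(manual, "manual_text")
    if vision_image_payload:
        # Image without text: use the image-only prompt and skip OCR entirely.
        return ResolvedSource(image_only_prompt, "vision_image")
    ocr = last_ocr_text.strip()
    if ocr:
        # The OCR fallback runs only when no image was pasted.
        return ResolvedSource(ocr, "ocr_snapshot")
    return None  # nothing to work with; the caller reports empty input
```

With this ordering, a pasted image can never be paired with text from an earlier screen unless the user typed or kept that text explicitly.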
🧹 Nitpick comments (1)
plugin/plugins/study_companion/entry_tutor_explain_entries.py (1)
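21-32: ⚡ Quick win: the EN and ZH prompts carry inconsistent instruction semantics, meow~
The English version asks the model to "explain" the image, but the Chinese versions (请查看这张图片的内容 / 請查看這張圖片的內容) only ask the model to "view the content". As the prompt injected for the explain feature, this points the vision model in a different direction, which can skew output style and quality across languages. Consider aligning the Chinese wording with the "explain" intent; no slacking off, small-fry developer, meow~
♻️ Suggested change: align the tone of the Chinese prompts
-IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN = "请查看这张图片的内容"
-IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW = "請查看這張圖片的內容"
+IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN = "请解释这张图片的内容"
+IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW = "請解釋這張圖片的內容"
Note: the related test test_study_explain_text_uses_prompt_for_image_only (test_study_companion_vision.py:942-952) hard-codes an assertion on 请查看这张图片的内容, so if this change is adopted the assertion must be updated as well, meow.
🤖 Prompt for AI Agents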
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py` around lines 21 - 32, The Chinese prompt strings are phrased as "please view" rather than "please explain," causing inconsistent instruction semantics; update IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN and IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW to match the English "explain" intent (e.g., use a phrasing like "请解释这张图片的内容" / "請解釋這張圖片的內容") and ensure the helper _image_only_explain_prompt still returns those constants unchanged; if you change these values, also update any tests that hard-code the old Chinese string (the test referencing test_study_explain_text_uses_prompt_for_image_only).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@plugin/plugins/study_companion/entry_tutor_explain_entries.py`:
- Around line 21-32: The Chinese prompt strings are phrased as "please view"
rather than "please explain," causing inconsistent instruction semantics; update
IMAGE_ONLY_EXPLAIN_PROMPT_ZH_CN and IMAGE_ONLY_EXPLAIN_PROMPT_ZH_TW to match the
English "explain" intent (e.g., use a phrasing like "请解释这张图片的内容" / "請解釋這張圖片的內容")
and ensure the helper _image_only_explain_prompt still returns those constants
unchanged; if you change these values, also update any tests that hard-code the
old Chinese string (the test referencing
test_study_explain_text_uses_prompt_for_image_only).
📒 Files selected for processing (3)
- plugin/plugins/study_companion/entry_tutor_explain_entries.py
- plugin/plugins/study_companion/entry_tutor_question_entries.py
- plugin/tests/unit/plugins/test_study_companion_vision.py
🚧 Files skipped from review as they are similar to previous changes (2)
- plugin/tests/unit/plugins/test_study_companion_vision.py
- plugin/plugins/study_companion/entry_tutor_question_entries.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 93ca0c8c90
if (image === null) {
  setters.setPasteError(setters.pasteErrorMessage);
} else {
  setters.setImage(image);
Clear stale OCR when accepting image-only paste
When the textarea has been auto-filled from last_ocr_text by refresh() and the user then pastes only a new image, this accepts the image but leaves the old OCR text in text; the Generate/Explain handlers send both { text, vision_image_base64 }, so the backend treats the stale OCR as manual context instead of taking its image-only path. This still reproduces the stale-OCR-with-new-image behavior even though the entry methods now skip OCR fallback when the text is empty.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3df8ca2ca7
const controller = beginStudyRequest();
setBusy(true);
const evalArgs: Record<string, unknown> = { answer, question };
if (answerImage) evalArgs.vision_image_base64 = answerImage;
Allow image-only answers to reach evaluation
When the learner pastes an answer image but leaves the answer textarea empty, the earlier if (!answer.trim()) guard returns before this image payload is added, so the new answer-image path is unreachable for handwritten/diagram answers unless the user types filler text. Since the backend now accepts vision_image_base64 without requiring a non-empty answer, the UI should treat answerImage as satisfying the answer requirement.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f298f1095
setText((prev) => {
  if (prev.trim() || !data.last_ocr_text) {
    return prev;
  }
  textAutoFilledFromOcrRef.current = true;
  return data.last_ocr_text;
Prevent OCR autofill after image submissions
When an image-only Explain/Generate request completes while last_ocr_text still contains an older snapshot, the subsequent refresh(..., { updateReply: false }) runs before the pasted image is cleared and this block repopulates the empty textarea with that stale OCR. The next action is then sent as manual text instead of the pasted-image context, reintroducing stale-OCR behavior after a successful image-only submission; skip this autofill when a text image is present or when the request just consumed an image.
try {
  const image = await compressImageForStudy(blob, signal);
  if (signal.aborted || !isMounted()) {
    return;
  }
  if (image === null) {
    setters.setPasteError(setters.pasteErrorMessage);
  } else {
    setters.onImageAccepted?.();
    setters.setImage(image);
Block actions until paste compression finishes
For large pasted JPEG/PNG images, compression is asynchronous and the action buttons remain enabled until setImage runs. If the user pastes an image and immediately clicks Explain/Generate/Evaluate, those handlers read textImage/answerImage while still empty and submit the request without the pasted image, so image-only pastes can fail or use stale text. Track a paste-pending state (or disable/abort actions) until this await completes and the image state is committed.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6319e2bc82
if (textImage || prev.trim() || !data.last_ocr_text) {
  return prev;
Use a ref when suppressing OCR autofill
If a refresh() call that was started before the user pastes an image resolves after setTextImage(image), this closure still sees the old textImage === '', so the guard lets last_ocr_text populate an otherwise empty textarea. That leaves a newly pasted image paired with stale OCR as manual text for the next Explain/Generate call; fresh evidence here is that the new autofill guard reads captured React state instead of a synchronously updated image ref.
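This PR mainly implements the "paste an image into a text box as vision input for the study companion" capability and keeps it contained within
What was done:
Why it was done this way:
Verification: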
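Resolve study_panel.tsx conflicts: keep the PR's image-paste ref group and the data-busy busy-state attribute, while merging in the panelRef and the accessibility role/aria-label attributes newly added on main (Project-N-E-K-O#1606 Phase 9).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>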
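Changes
This branch keeps the image paste capability contained within the study_companion plugin: an image pasted into a text box in the study panel can serve as vision input for explanation, question generation, and answer evaluation.
Main changes:
- study_generate_question and study_evaluate_answer accept the vision_image_base64 parameter.
- study_generate_question gains an outer exception fallback that preserves the operation context.
Verification
- uv run pytest plugin/tests/unit/plugins/test_study_companion_vision.py -q
- uv run pytest plugin/tests/unit/plugins/test_study_companion.py -q
- uv run ruff check ...
- esbuild plugin/plugins/study_companion/surfaces/study_panel.tsx
- git diff --check
- detect_changes: LOW, 0 affected flows
Risk notes
The change scope is confined to the study_companion plugin and its plugin test files; it does not touch the main chat Composer or global paste handling.
Summary by CodeRabbit
New Features
Style
Tests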